IN THE CLAIMS 



1. (Currently Amended) A method comprising:

responsive to receiving a single packed shuffle instruction designating, 
with 3 bits, a first register storing a first operand having a set of L 
data elements and designating, with 3 bits, a second register storing 
a second operand having a set of L control elements, wherein the
first operand and second operand are of a same size and each of the 
L data elements and L control elements are of a same size, and 
wherein each one of the L control elements is divided into three 
portions, the first portion being a flush to zero bit occupying the 
most significant bit of each control element, the second portion 
being a position selection field that is at least log2L bits wide and 
indicates a position of one of said L data elements, and a third 
portion; storing a resultant operand in
said first register having L resultant data elements of the same size 
as the L data elements and the L control elements, wherein the 
value of each resultant data element is controlled by the position 
selection field of the L control elements in the same position as the 
resultant data element, and is either, 

the one of the L data elements designated by the position selection
field of said control element if said control element's flush to
zero bit is not set; or

a zero if said control element's flush to zero bit is set.
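
For illustration only, the selection rule recited in claim 1 can be modeled in software. The sketch below is not part of the claims. It assumes byte-wide elements (as in claims 8 and 9), L being a power of two, and a position selection field placed in the low log2L bits of each control element. The function name `packed_shuffle` and the sample values are hypothetical, and the bits of the third portion are simply ignored.

```python
def packed_shuffle(data, control):
    """Software model of a byte-wise packed shuffle with flush to zero.

    data    -- list of L data elements (first operand), each 0..255
    control -- list of L control elements (second operand), each 0..255
    Returns the list of L resultant data elements.
    """
    L = len(data)
    if len(control) != L:
        raise ValueError("both operands must hold the same number of elements")
    if L == 0 or L & (L - 1):
        raise ValueError("this sketch assumes L is a power of two")

    select_mask = L - 1  # low log2(L) bits: position selection field (assumed placement)
    result = []
    for ctrl in control:
        if ctrl & 0x80:
            # Most significant bit is the flush to zero bit: result is zero.
            result.append(0)
        else:
            # Otherwise copy the data element at the selected position.
            result.append(data[ctrl & select_mask])
    return result


# Hypothetical operands for L = 8.
data = [0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17]
control = [0x07, 0x80, 0x00, 0x03, 0x83, 0x01, 0x01, 0x06]
print([hex(x) for x in packed_shuffle(data, control)])
```

Each resultant position depends only on the control element in the same position, so a data element may be copied to several positions or to none.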

2. (Cancelled) 

3. (Cancelled) 

4. (Currently Amended) The method of claim 1 wherein said control
element is to designate a first operand data element by a data element position 
number. 

5. (Cancelled) 

6. (Cancelled) 

7. (Currently Amended) The method of claim 1 further comprising
outputting a resultant data block comprising data that was shuffled from said
first operand in response to said control elements of said second operand.

8. (Original) The method of claim 1 wherein each of said data 
elements comprises a byte of data. 

9. (Original) The method of claim 8 wherein each of said control 
elements is a byte wide. 

10. (Original) The method of claim 9 wherein L is 8 and wherein said first
operand, said second operand, and said resultant are each comprised of 64-bit
wide packed data.

11. (Original) The method of claim 9 wherein L is 16 and wherein said first
operand, said second operand, and said resultant are each comprised of 128-bit
wide packed data.
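
As a hedged illustration of how a byte-wide control element might be divided for the L = 8 and L = 16 cases of claims 10 and 11, the sketch below splits a control byte into its three portions. It assumes the position selection field is exactly log2L bits wide and occupies the least significant bits, with the third portion taking the bits between it and the flush to zero bit; the claims only require a field of at least log2L bits, so this placement is an assumption. The function name `split_control_byte` is hypothetical.

```python
def split_control_byte(ctrl, L):
    """Split a byte-wide control element into (flush, position, third).

    Assumed layout (illustrative only):
      bit 7               -- flush to zero bit
      bits log2(L)..6     -- third portion
      bits 0..log2(L)-1   -- position selection field
    """
    if L <= 0 or L & (L - 1):
        raise ValueError("this sketch assumes L is a power of two")
    select_bits = L.bit_length() - 1          # log2(L) for a power of two
    flush = (ctrl >> 7) & 1                   # most significant bit
    position = ctrl & ((1 << select_bits) - 1)
    third = (ctrl & 0x7F) >> select_bits      # remaining bits below the flush bit
    return flush, position, third


# 64-bit packed data: L = 8, so the position field is 3 bits wide.
print(split_control_byte(0x85, 8))
# 128-bit packed data: L = 16, so the position field is 4 bits wide.
print(split_control_byte(0x2B, 16))
```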

12. (Currently Amended) An apparatus comprising: 

an execution unit to execute a single packed shuffle instruction including 
designating, with 3 bits, a first register storing a first operand 
comprised of a set of L data elements and designating, with 3 bits, a 
second register storing a second operand comprised of a set of L 
control elements, wherein the first operand and second operand are
of a same size and each of the L data elements and L control 
elements are of a same size, and wherein each one of the L control 
elements is divided into three portions, the first portion being a 
flush to zero bit occupying the most significant bit of each control 
element, the second portion being a position selection field that is at
least log2L bits wide and indicates a position of one of said L data 
elements, and a third portion, said shuffle instruction to cause said 
execution unit to: store a
resultant operand in said first register having L resultant data 
elements of the same size as the L data elements and the L control 
elements, wherein the value of each resultant data element is 
controlled by the position selection field of the L control elements 
in the same position as the resultant data element, and is either, 
a zero if said control element's flush to zero bit is true, otherwise
the one of the L data elements designated by the position selection
field of said individual control element.

13. (Original) The apparatus of claim 12 wherein each of said L control
elements occupies a position in said second operand and is associated with a 
similarly located data element position in a resultant. 

14. (Original) The apparatus of claim 13 wherein each individual 
control element is to designate a first operand data element by a data element 
position number. 

15. (Cancelled) 

16. (Cancelled) 

17. (Currently Amended) The apparatus of claim 12 wherein said
shuffle instruction is to further cause said execution unit to generate a resultant 
having L data element positions that have been filled based on said set of L 
control elements. 

18. (Original) The apparatus of claim 12 wherein each of said data 
elements comprises a byte of data and each of said control elements is a byte 
wide. 



Appl. No. 10/611,344
Atty. Docket No. 42P15762



19. (Original) The apparatus of claim 18 wherein L is 8 wherein said first
operand, said second operand, and said resultant are each comprised of 64-bit
wide packed data.

20. (Original) The apparatus of claim 18 wherein L is 16 and wherein 
said first operand, said second operand, and said resultant are each comprised of 
128-bit wide packed data. 

21. (Currently Amended) An article of manufacture comprising a 
machine readable medium that stores data, that when accessed by a machine,
causes the machine to perform operations representing a predetermined function
comprising: 

responsive to receiving a single packed shuffle instruction designating, 
with 3 bits, a first register storing a first operand having a set of L 
data elements and designating, with 3 bits, a second register storing 
a second operand having a set of L control elements, wherein the
first operand and second operand are of a same size and each of the 
L data elements and L control elements are of a same size, and 
wherein each one of the L control elements is divided into three 
portions, the first portion being a flush to zero bit occupying the 
most significant bit of each control element, the second portion
being a position selection field that is at least log2L bits wide and 
indicates a position of one of said L data elements, and a third 
portion; storing a resultant operand in
said first register having L resultant data elements of the same size 
as the L data elements and the L control elements, wherein the 
value of each resultant data element is controlled by the position 
selection field of the L control elements in the same position as the 
resultant data element, and is either, 

the one of the L data elements designated by the position selection
field of said control element if said control element's flush to
zero bit is not set; or

a zero if said control element's flush to zero bit is set.

22. (Currently Amended) The article of manufacture of claim 21 
wherein said data stored by said machine readable medium represents an 
integrated circuit design, which when fabricated performs said predetermined 
function in response to a single instruction.

23. (Currently Amended) The article of manufacture of claim 22
wherein said machine readable medium further includes data, that causes the
machine to perform operations further comprising:

generating a resultant having L data element positions that have been filled in
accordance to said set of L control elements.

24. (Currently Amended) The article of manufacture of claim 23 
wherein each of said L control elements is associated with a similarly located 
data element position in a resultant. 

25. (Currently Amended) The article of manufacture of claim 24 
wherein each individual control element is to designate a first operand data 
element by a data element position number. 

26. (Currently Amended) The article of manufacture of claim 25 
wherein each of said data elements comprises a byte of data. 

27. (Cancelled)

28. (Cancelled)



29. (Original) The article of manufacture of claim 21 wherein said data 
stored by said machine readable medium represents a computer instruction, 
which, if executed by a machine, causes said machine to perform said 
predetermined function. 

30. (Currently Amended) A method comprising: 

responsive to receiving a single packed shuffle instruction designating, 
with 3 bits, a first register storing a first operand having a set of L 
data elements and designating, with 3 bits, a second
register storing a second operand having a set of L masks, wherein 
the first operand and second operand are of a same size and each of 
the L data elements and L masks are of a same size, and wherein 
each one of the L masks is divided into three portions, the first 
portion being a flush to zero bit occupying the most significant bit 
of each control element, the second portion being a position 
selection field that is at least log2L bits wide and indicates a 
position of one of said L data elements, and a third portion, and 
wherein each of said L masks occupies a particular position in said 
second operand and is associated with a similarly located data 
element position in a resultant operand, each of said L masks to
include a flush to zero field; storing the resultant operand in said 
first register having L resultant data elements of the same size as 
the L data elements and the L masks, wherein the value of each 
resultant data element is controlled by the position selection field of 
the L masks in the same position as the resultant data element, and 
is either, 

a zero if said mask's flush to zero bit is set; or

if said mask's flush to zero bit is not set, the one of the L data
elements designated by the position selection field of said mask.

31.-33. (Cancelled) 

34. (Currently Amended) The method of claim 30 wherein said first
operand, said second operand, and said resultant are each comprised of 64-bit 
wide packed data.

35. (Currently Amended) The method of claim 30 wherein said first
operand, said second operand, and said resultant are each comprised of 128-bit 
wide packed data. 



36. (Currently Amended) A method comprising: 

responsive to receiving a single packed shuffle instruction designating, 
with 3 bits, a first register storing a first operand having a set of L 
data elements and designating, with 3 bits, a second
register storing a second operand having a set of L shuffle masks, 
wherein the first operand and second operand are of a same size 
and each of the L data elements and L masks are of a same size, and 
wherein each one of the L shuffle masks is divided into three 
portions, the first portion being a flush to zero bit occupying the 
most significant bit of each control element, the second portion 
being a position selection field that is at least log2L bits wide and 
indicates a position of one of said L data elements, and a third 
portion, and wherein each of said L shuffle masks is associated 
with a similarly located data element position in a resultant 
operand, each of said L masks to include a flush to zero field;
storing the resultant operand in said first register having L 
resultant data elements of the same size as the L data elements and 
the L masks, wherein the value of each resultant data element is
controlled by the position selection field of the L individual masks 
in the same position as the resultant data element, and is either, 
a zero if said mask's flush to zero bit is set, otherwise

the one of the L data elements designated by the position selection
field of said individual shuffle mask.

37. (Cancelled) 

38. (Cancelled) 

39. (Currently Amended) An apparatus comprising: 

a first memory location to store a plurality of source data elements; 

a second memory location to store a plurality of control elements, each of 
said control elements to correspond to a resultant data element 
position, and wherein each one of said control elements is divided
into three portions, the first portion being a flush to zero bit
occupying the most significant bit of each control element, the
second portion being a position selection field that is at least
log2L bits wide and indicates a position of one of said L data
elements, and a third portion;

control logic coupled to said first memory location and said second
memory location, said control logic in response to the receipt of a
single packed shuffle instruction designating, with three bits, a first 
memory location storing a first operand having a set of L data 
elements and designating a second memory location storing a 
second operand having a set of L control elements, wherein the first 
operand and the second operand are of a same size and each of the 
L data elements and L control elements are of a same size, to values 
of said control elements to generate a plurality of selection signals 
and a plurality of flush to zero signals, a zero signal generated
when a control element's flush to zero bit is set;

a first plurality of multiplexers coupled to said first memory location and 
said plurality of selection signals, each of said first plurality of 
multiplexers to store a resultant operand in said first
memory location having L resultant data elements of the same size 
as the L data elements and the L control elements, wherein the 
value of each resultant data element is controlled by the position
selection signal of the L control elements in the same position as the 
resultant data element, and is the one of the L data elements for a 
specific resultant data element position in response to a selection 
signal corresponding to said specific resultant data element 
position; and 

a second plurality of multiplexers coupled to said first plurality of
multiplexers and to said plurality of flush to zero signals, each of
said second plurality of multiplexers associated with a specific 
resultant data element position, each of said second plurality of 
multiplexers to output a zero if its flush to zero signal is active or to 
output a data element shuffled for that specific resultant data 
element position. 
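
Purely as an illustrative sketch, the two-stage arrangement recited in claim 39 can be modeled as control logic that derives one selection signal and one flush to zero signal per resultant position, followed by a first multiplexer that picks a source data element and a second multiplexer that passes either that element or a zero. The function names below are hypothetical, the element count is assumed to be a power of two, and the selection signal is assumed to come from the low log2L bits of each byte-wide control element.

```python
def control_logic(control, L):
    """Derive per-position selection and flush to zero signals."""
    selects = [c & (L - 1) for c in control]     # position selection field (assumed low bits)
    flushes = [bool(c & 0x80) for c in control]  # flush to zero bit in the most significant bit
    return selects, flushes


def first_stage_mux(source, select):
    """First multiplexer: pick the source data element named by the selection signal."""
    return source[select]


def second_stage_mux(shuffled, flush):
    """Second multiplexer: output zero when the flush to zero signal is active."""
    return 0 if flush else shuffled


def shuffle_datapath(source, control):
    """Chain both multiplexer stages for every resultant data element position."""
    L = len(source)
    if len(control) != L or L == 0 or L & (L - 1):
        raise ValueError("sketch assumes equal-length operands and L a power of two")
    selects, flushes = control_logic(control, L)
    return [second_stage_mux(first_stage_mux(source, s), f)
            for s, f in zip(selects, flushes)]


# Hypothetical 4-element example.
print(shuffle_datapath([0xA0, 0xA1, 0xA2, 0xA3], [0x03, 0x80, 0x01, 0x82]))
```

In this model the first stage always selects an element, and the second stage alone decides whether that element or a zero reaches the resultant position.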

40. (Original) The apparatus of claim 39 wherein said plurality of 
source data elements is a first packed data operand. 

41. (Original) The apparatus of claim 40 where said plurality of control 
elements is a second packed data operand. 

42. (Original) The apparatus of claim 40 wherein said first and second 
memory locations are a single instruction multiple data registers.

43. (Original) The apparatus of claim 42 wherein: 

said first packed operand is 64 bits long and each of said source data
elements is a byte wide; and

said second packed operand is 64 bits long and each of said control
elements is a byte wide.

44. (Original) The apparatus of claim 42 wherein: 

said first packed operand is 128 bits long and each of said source data
elements is a byte wide; and

said second packed operand is 128 bits long and each of said control
elements is a byte wide.

45. (Currently Amended) An apparatus comprising: 

control logic to receive a single packed shuffle instruction designating, 
with three bits, a first memory location storing a first operand 
having a set of M data elements and designating, with three bits, a 
second memory location storing a second operand having a set of L 
shuffle masks, wherein each of the M data elements and L shuffle 
masks are of a same size, and wherein each one of the L shuffle 
masks is divided into three portions, the first portion being a flush
to zero bit occupying the most significant bit of each shuffle mask, 
the second portion being a position selection field that is at least 
log2L bits wide, and a third portion, and wherein each shuffle mask 
is associated with a unique resultant data element position 
controlled by the position selection field of said shuffle mask, said 
control logic to provide a select signal and a flush to zero signal for 
each resultant data element position;

a set of L multiplexers coupled to said control logic, wherein each
multiplexer is also associated with a unique resultant data element
position, each multiplexer to output to said first memory location 
either, 

a zero if said shuffle mask's flush to zero signal is active or

the one of the M data elements designated by the select signal of said
shuffle mask if said shuffle mask's flush to zero signal is inactive.

46. (Original) The apparatus of claim 45 further comprising a register 
with L unique data element positions, each data element position to hold an 
output from its associated multiplexer.

47. (Original) The apparatus of claim 46 wherein L is 16 and M is 16. 

48. (Currently Amended) A system comprising: 
a memory to store data and instructions; 

a processor coupled to said memory on a bus, said processor operable to 
perform a shuffle operation, said processor comprising: 
a bus unit to receive a single packed shuffle instruction from said
memory, said instruction to designate, with 3 bits, a first
register storing L data elements from a first operand, and to
designate, with three bits, L shuffle control elements from a
second operand, wherein the first operand and second
operand are of a same size and each of the L data elements
and L control elements are of a same size, and wherein each
one of the L control elements is divided into three portions,
the first portion being a flush to zero bit occupying the most
significant bit of each control element, the second portion
being a position selection field that is at least log2L bits wide
and indicates a position of one of the L data elements, and a
third portion;

an execution unit coupled to said bus unit, said execution unit to execute 
said single packed shuffle instruction, said single packed shuffle 
instruction to cause said execution unit to: 

store a resultant operand in said
first register having L resultant data elements of the same 
size as the L data elements and the L control elements, 
wherein the value of each resultant data element is 
controlled by the position selection field of the L control 
elements in the same position as the resultant data element, 
and is either, 

the one of the L data elements designated by the position selection
field of said control element if said control element's flush to
zero bit is not set; or

a zero if said control element's flush to zero bit is set.

49.-51. (Cancelled)



52. (Original) The system of claim 48 wherein each data element is a 
byte wide, each shuffle command element is a byte wide, and L is 8. 



53. (Original) The system of claim 48 wherein said first operand is 64 
bits long and said second operand is 64 bits long. 